Corpus: cat_wikipedia_2014_100K


5.1.18 Words nearly always as next neighbors

Strong NN co-occurrences with a low probability of being separated

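The quotient below is calculated as freq(word1)*freq(word2)/NN_freq^2. A value of 1.00 means that both words occur only as next neighbors of each other; larger values mean that at least one of them also occurs elsewhere.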
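As a minimal sketch of this calculation, the snippet below multiplies the two word frequencies and divides by the squared next-neighbor frequency, then checks the result against three rows of the table. The function name nn_quotient and the rows list are illustrative assumptions, not part of the corpus tooling.

```python
def nn_quotient(freq_word1: int, freq_word2: int, nn_freq: int) -> float:
    """Product of both word frequencies over the squared NN frequency."""
    if nn_freq <= 0:
        raise ValueError("nn_freq must be positive")
    return freq_word1 * freq_word2 / nn_freq ** 2


# (word 1, word 2, freq of word 1, freq of word 2, freq as NN), copied from the table
rows = [
    ("unitats", "familiars", 1010, 935, 853),
    ("Estats", "Units", 521, 493, 474),
    ("Tel", "Aviv", 4, 4, 4),
]

for word1, word2, f1, f2, nn in rows:
    print(f"{word1} {word2}: {nn_quotient(f1, f2, nn):.2f}")
# unitats familiars: 1.30
# Estats Units: 1.14
# Tel Aviv: 1.00
```

The quotient cannot drop below 1.00, because a pair cannot occur as next neighbors more often than either of its words occurs at all.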

Word 1 Word 2 Frequency of word 1 Frequency of word 2 Frequency as NN Quotient
unitats familiars 1010 935 853 1.30
Estats Units 521 493 474 1.14
Buenos Aires 41 42 41 1.02
Sri Lanka 18 13 13 1.38
Bangla Desh 6 7 6 1.17
trenc d'alba 6 7 6 1.17
Encyclopædia Britannica 5 6 5 1.20
Menéndez Pelayo 8 6 6 1.33
Ursus americanus 7 6 6 1.17
La Gazzetta dello 6 6 6 1.00
Gazzetta dello 6 6 6 1.00
Dalai Lama 4 5 4 1.25
Fabià Puigserver 4 5 4 1.25
Mossos d'Esquadra 7 5 5 1.40
Lagopus muta 4 5 4 1.25
Tel Aviv 4 4 4 1.00
Desa Bhoj 3 4 3 1.33
Sinn Féin 4 4 4 1.00
Sant Hilari Sacalm 5 4 4 1.25
Nuestra Señora 5 4 4 1.25